04:00
2026-06-18
arxiv.org
large-language-models
JetFlow: Breaking the Scaling Ceiling of Speculative Decoding with Parallel Tree Drafting
Researchers from Hao AI Lab introduced JetFlow, a speculative decoding framework that breaks the scaling ceiling of autoregressive LLMs by combining one-forward drafting efficiency with branch-wise ca…